Artificial intelligence system for continuous affect estimation from naturalistic human expressions
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.

The analysis and automatic estimation of affect from human expressions is an active research topic in the computer vision community. Most reported affect recognition systems, however, only consider subjects performing well-defined, acted expressions under very controlled conditions, so they are not robust enough for real-life recognition tasks involving subject variation, acoustic surroundings and illumination changes. In this thesis, an artificial intelligence system is proposed to continuously estimate affective behaviour (represented along a continuum, e.g. from -1 to +1) in terms of latent dimensions (e.g. arousal and valence) from naturalistic human expressions. To tackle these issues, both feature representation and machine learning strategies are addressed. In feature representation, human expression is captured through several modalities: audio, video, physiological signals and text. Hand-crafted features are extracted from each modality per frame, in order to match the consecutive affect labels. However, the extracted features may be missing information due to factors such as background noise or lighting conditions. The Haar Wavelet Transform is employed to determine whether a noise-cancellation mechanism in feature space should be considered in the design of the affect estimation system. In addition to hand-crafted features, deep learning features are analysed layer-wise, across the convolutional and fully connected layers. Convolutional Neural Networks such as AlexNet, VGGFace and ResNet are selected as the deep learning architectures for feature extraction from facial expression images. A multimodal fusion scheme is then applied, fusing deep learning and hand-crafted features together to improve performance. In machine learning strategies, a two-stage regression approach is introduced. In the first stage, baseline regression methods such as Support Vector Regression estimate each affect dimension per time step. In the second stage, a subsequent model such as a Time Delay Neural Network, Long Short-Term Memory network or Kalman Filter models the temporal relationships between consecutive estimates of each affect dimension. In doing so, the temporal information exploited by the subsequent model is not biased by the high variability present in consecutive frames, and at the same time the network can exploit the slowly changing emotional dynamics more efficiently. Following the two-stage regression approach for unimodal affect analysis, the fusion of information from different modalities is elaborated. Continuous emotion recognition in the wild is improved by investigating mathematical models for each emotion dimension: Linear Regression, Exponent Weighted Decision Fusion and Multi-Gene Genetic Programming are implemented to quantify the relationships between modalities. In summary, the research presented in this thesis develops a systematic approach to automatically and continuously estimate affect from naturalistic human expressions. The proposed system, consisting of feature smoothing, deep learning features, a two-stage regression framework and fusion via mathematical models of the relationships between modalities, offers a strong basis for the development of artificial intelligence systems for continuous affect estimation and, more broadly, for building real-time emotion recognition systems for human-computer interaction.

Majlis Amanah Rakyat (MARA), Malaysia
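The two-stage idea described above — frame-wise regression followed by a temporal model over consecutive estimates — can be sketched in a few lines. The sketch below is illustrative only: it uses ridge regression as a dependency-free stand-in for Support Vector Regression, synthetic data in place of real affect annotations, and a scalar Kalman filter as the second-stage temporal model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic per-frame features and a slowly varying arousal trace (illustrative data).
T, D = 200, 8
X = rng.normal(size=(T, D))
w_true = rng.normal(size=D)
arousal = np.convolve(X @ w_true, np.ones(15) / 15, mode="same")  # slow dynamics

# Stage 1: frame-wise regression (ridge here, standing in for SVR).
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(D), X.T @ arousal)
raw_pred = X @ w  # noisy per-frame estimates

# Stage 2: a 1-D Kalman filter smooths the consecutive estimates,
# so the temporal model is not dominated by frame-to-frame variability.
def kalman_smooth(z, q=1e-3, r=1e-1):
    x, p = z[0], 1.0
    out = np.empty_like(z)
    for t, zt in enumerate(z):
        p = p + q              # predict (random-walk state model)
        k = p / (p + r)        # Kalman gain
        x = x + k * (zt - x)   # update with measurement zt
        p = (1 - k) * p
        out[t] = x
    return out

smooth_pred = kalman_smooth(raw_pred)
err_raw = np.mean((raw_pred - arousal) ** 2)
err_smooth = np.mean((smooth_pred - arousal) ** 2)
```

The smoothed trace varies far less between frames than the raw stage-one output, which is the property the second stage is meant to provide.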
Automatic depression scale prediction using facial expression dynamics and regression
Depression is a state of low mood and aversion to activity that can affect a person's thoughts, behaviour, feelings and sense of well-being. In such a low mood, both facial expression and voice differ from those in a normal state. In this paper, an automatic system is proposed to predict Beck Depression Inventory scores from the naturalistic facial expressions of patients with depression. First, features are extracted from the corresponding video and audio signals to represent the characteristics of facial and vocal expression under depression. Second, a dynamic feature generation method is proposed in the extracted video feature space, based on the idea of the Motion History Histogram (MHH) for 2-D video motion extraction. Third, Partial Least Squares (PLS) regression and linear regression are applied to learn the relationship between the dynamic features and depression scores on training data, and then to predict the scores for unseen subjects. Finally, decision-level fusion combines the predictions from the video and audio modalities. The proposed approach is evaluated on the AVEC2014 dataset, and the experimental results demonstrate its effectiveness.

The work by Asim Jan was supported by a School of Engineering & Design/Thomas Gerald Gray PGR Scholarship. The work by Hongying Meng and Saeed Turabzadeh was partially funded by the award of the Brunel Research Initiative and Enterprise Fund (BRIEF). The work by Yona Falinie Binti Abd Gaus was supported by a Majlis Amanah Rakyat (MARA) Scholarship.
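The decision-level fusion step described above can be sketched simply: each modality produces its own scale prediction, and the predictions are combined with weights reflecting each modality's reliability. The code below is a toy illustration with synthetic data and plain least-squares regressors standing in for PLS; the inverse-error weighting rule is an assumption for the sketch, not the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-ins: per-subject video and audio feature vectors with BDI-like targets.
n, dv, da = 60, 10, 6
Xv, Xa = rng.normal(size=(n, dv)), rng.normal(size=(n, da))
y = Xv @ rng.normal(size=dv) + 0.5 * Xa @ rng.normal(size=da)

# Per-modality linear regressors (least squares here; the paper uses PLS).
wv, *_ = np.linalg.lstsq(Xv, y, rcond=None)
wa, *_ = np.linalg.lstsq(Xa, y, rcond=None)
pred_v, pred_a = Xv @ wv, Xa @ wa

# Decision-level fusion: weight each modality by the other's training error,
# so the lower-error modality receives the larger weight.
ev = np.mean((pred_v - y) ** 2)
ea = np.mean((pred_a - y) ** 2)
alpha = ea / (ev + ea)
fused = alpha * pred_v + (1 - alpha) * pred_a
```

Because the fusion is a convex combination, its squared error is never worse than the worse single modality on the same data.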
Hidden Markov Model-Based Gesture Recognition with Overlapping Hand-Head/Hand-Hand Estimated Using Kalman Filter
In this paper, we introduce a hand gesture recognition system for isolated Malaysian Sign Language (MSL). The system consists of four modules: collection of input images, feature extraction, Hidden Markov Model (HMM) training, and gesture recognition. First, we apply a skin segmentation procedure to the input frames to detect only the skin regions. We then extract features consisting of centroids, hand distances and hand orientations. A Kalman Filter is used to identify overlapping hand-head or hand-hand regions. Once the feature vector has been extracted, the hand gesture trajectory is represented as a gesture path to reduce system complexity. We apply Hidden Markov Models (HMMs) to recognize the input gesture: the gesture to be recognized is scored separately against the different HMMs, and the model with the highest score indicates the corresponding gesture. In our experiments, the system was tested on 112 MSL gestures and achieved a recognition rate of about 83%.
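The final recognition step — scoring one observation sequence against several trained HMMs and picking the highest-scoring model — can be sketched with a scaled forward algorithm over discrete observation codes. The two models, their parameters and the quantized gesture path below are all hypothetical, chosen only to make the selection step concrete.

```python
import numpy as np

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm for a discrete-observation HMM."""
    alpha = pi * B[:, obs[0]]          # initialize with first observation
    ll = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate and absorb next observation
        s = alpha.sum()
        ll += np.log(s)                # accumulate log-likelihood via scaling
        alpha /= s
    return ll

# Two toy 2-state models over 3 quantized direction codes (hypothetical gestures).
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B_right = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]])  # favours codes 0/1
B_up    = np.array([[0.1, 0.1, 0.8], [0.1, 0.8, 0.1]])  # favours codes 2/1
models = {"right": B_right, "up": B_up}

obs = np.array([0, 0, 1, 1, 0])        # a quantized gesture path
scores = {name: log_likelihood(obs, pi, A, B) for name, B in models.items()}
best = max(scores, key=scores.get)     # highest-scoring model wins
```

In a full system each HMM would be trained on example trajectories of its gesture; here the emission matrices are fixed by hand so the selection is deterministic.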
Electrical characterization and source-drain voltage dependent mobility of P-channel organic field-effect transistors using MATLAB simulation
We demonstrate the fabrication of a p-channel (small-molecule) organic field-effect transistor (OFET) with a bottom gate and top source-drain contacts, using pentacene as the active semiconductor layer and silicon dioxide (SiO2) as the gate dielectric. The device exhibits the typical output curves of a field-effect transistor (FET). Furthermore, the electrical characterization was analysed to investigate the source-drain voltage (Vds) dependence of the mobility. The mobility, calculated using a MATLAB simulation, ranged from 0.0234 to 0.0258 cm2/Vs with increasing source-drain voltage (the average mobility was 0.0254 cm2/Vs). This work suggests that the mobility increases with increasing source-drain voltage, similar to the gate-voltage-dependent mobility phenomenon.
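The standard saturation-regime mobility extraction behind such a calculation can be sketched as follows (in Python rather than MATLAB, and with entirely hypothetical device parameters): fit the slope of sqrt(Id) against Vgs and recover the mobility from the square-law relation Id = (W·Cox·μ / 2L)·(Vgs − Vth)².

```python
import numpy as np

# Hypothetical device parameters (illustrative, not taken from the paper).
W, L = 0.1, 0.005      # channel width and length [cm]
C_ox = 1.2e-8          # gate-dielectric capacitance per unit area [F/cm^2]
mu_true = 0.025        # mobility used to generate the synthetic curve [cm^2/Vs]
V_th = -5.0            # threshold voltage [V] (p-channel)

# Synthetic saturation-regime transfer curve: Id = (W*C_ox*mu)/(2L) * (Vgs - Vth)^2
Vgs = np.linspace(-40.0, -10.0, 31)
Id = (W * C_ox * mu_true) / (2 * L) * (Vgs - V_th) ** 2

# Extract mobility from the slope of sqrt(Id) vs Vgs (linear in the saturation regime).
slope = np.polyfit(Vgs, np.sqrt(Id), 1)[0]
mu_sat = 2 * L / (W * C_ox) * slope ** 2
```

On real measured data the same fit would be applied to the linear portion of sqrt(Id) above threshold; here the synthetic curve makes the recovery exact.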
Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers
Generating 3D images of complex objects conditionally from a few 2D views is a difficult synthesis problem, compounded by issues such as domain gap and geometric misalignment. For instance, a unified framework such as Generative Adversarial Networks cannot achieve this unless it explicitly defines both a domain-invariant and a geometry-invariant joint latent distribution, whereas Neural Radiance Fields are generally unable to handle both issues because they optimize at the pixel level. By contrast, we propose a simple and novel 2D-to-3D synthesis approach based on conditional diffusion with vector-quantized codes. Operating in an information-rich code space enables high-resolution 3D synthesis via full-coverage attention across the views. Specifically, we generate the 3D codes, e.g. for CT images, conditioned on previously generated 3D codes and the entire codebook of two 2D views (e.g. 2D X-rays). Qualitative and quantitative results demonstrate state-of-the-art performance over specialized methods across varied evaluation criteria, including fidelity metrics such as density and coverage, and distortion metrics, on two datasets of complex volumetric imagery found in real-world scenarios.
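The "information-rich code space" such methods operate in comes from vector quantization: continuous latents are replaced by indices into a learned codebook, and the diffusion model works over those discrete codes. A minimal nearest-neighbour sketch of the encode/decode step (with a random stand-in codebook, not a trained VQ model) looks like this:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy codebook of K code vectors of dimension d (a stand-in for a trained VQ codebook).
K, d = 16, 4
codebook = rng.normal(size=(K, d))

def encode(z):
    """Map each latent vector to the index of its nearest codebook entry."""
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

def decode(idx):
    """Replace indices with their code vectors (back into continuous space)."""
    return codebook[idx]

z = rng.normal(size=(10, d))  # continuous latents, e.g. from an encoder
codes = encode(z)             # discrete codes the diffusion model would operate on
z_q = decode(codes)           # quantized reconstruction fed to a decoder
```

Encoding is idempotent in the sense that each codebook entry maps back to its own index, which is what makes the discrete codes a faithful address space for the latents.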